Bharath Hariharan

Cornell University

Composing People Together: Iterative Pose-Image Generation for Multi-Person Interaction Scenes

May 22, 2026
Via arXiv

Flat-Pack Bench: Evaluating Spatio-Temporal Understanding in Large Vision-Language Models through Furniture Assembly

May 20, 2026
Via arXiv

$Δ$ynamics: Language-Based Representation for Inferring Rigid-Body Dynamics From Videos

May 20, 2026
Via arXiv

CityRAG: Stepping Into a City via Spatially-Grounded Video Generation

Apr 21, 2026
Via arXiv

WildDet3D: Scaling Promptable 3D Detection in the Wild

Apr 09, 2026
Via arXiv

Live Interactive Training for Video Segmentation

Mar 27, 2026
Via arXiv

When the City Teaches the Car: Label-Free 3D Perception from Infrastructure

Mar 17, 2026
Via arXiv

On the Feasibility and Opportunity of Autoregressive 3D Object Detection

Mar 09, 2026
Via arXiv

MovieRecapsQA: A Multimodal Open-Ended Video Question-Answering Benchmark

Jan 05, 2026
Via arXiv

Tracking and Understanding Object Transformations

Nov 06, 2025
Via arXiv